Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
Authors
Abstract
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
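The distinction the abstract draws between estimating branch lengths from the data (ML) and integrating over a distribution of branch lengths (BI) can be illustrated on a deliberately tiny case. The sketch below is not the authors' analysis and does not reproduce the long-branch attraction bias: it only contrasts a maximized likelihood with a prior-averaged likelihood for a single branch separating two sequences. The Jukes-Cantor (JC69) model, the exponential prior with rate 10, the integration limit, and all function names are assumptions made for illustration.

```python
import math

def log_likelihood(d, n_sites, n_diff):
    """JC69 log-likelihood of two aligned sequences separated by distance d > 0."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # joint probability of one identical site pattern
    p_diff = 0.25 * (0.25 - 0.25 * e)  # joint probability of one differing site pattern
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)

def ml_distance(n_sites, n_diff):
    """Closed-form JC69 ML estimate; requires 0 < n_diff < 0.75 * n_sites."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)

def maximized_log_likelihood(n_sites, n_diff):
    """ML-style score: plug in the single best-fitting branch length."""
    return log_likelihood(ml_distance(n_sites, n_diff), n_sites, n_diff)

def integrated_log_likelihood(n_sites, n_diff, prior_rate=10.0, upper=5.0, steps=20000):
    """BI-style score: log of the likelihood averaged over an exponential prior.

    Assumed prior: Exponential(prior_rate) on d, truncated at `upper`.
    Uses the midpoint rule, with log-sum-exp to avoid underflow.
    """
    h = upper / steps
    log_terms = []
    for i in range(steps):
        d = (i + 0.5) * h  # midpoint of the i-th grid cell, always > 0
        log_prior = math.log(prior_rate) - prior_rate * d
        log_terms.append(log_likelihood(d, n_sites, n_diff) + log_prior)
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms)) + math.log(h)

if __name__ == "__main__":
    n_sites, n_diff = 100, 10  # hypothetical data: 10 differences in 100 sites
    print("ML branch length:        ", ml_distance(n_sites, n_diff))
    print("maximized log-likelihood:", maximized_log_likelihood(n_sites, n_diff))
    print("integrated log-likelihood:", integrated_log_likelihood(n_sites, n_diff))
```

The integrated score can never exceed the maximized one, because it is a weighted average of likelihoods that are each no larger than the maximum. How much lower it falls depends on how much prior mass sits on branch lengths that fit the data poorly, which is one way the choice of prior can influence comparisons between topologies.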
Similar resources
Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test.
Convergence has long been of interest to evolutionary biologists. Cave organisms appear to be ideal candidates for studying convergence in morphological, physiological, and developmental traits. Here we report apparent convergence in two cave-catfishes that were described on morphological grounds as congeners: Prietella phreatophila and Prietella lundbergi. We collected mitochondrial DNA sequen...
Error, bias, and long-branch attraction in data for two chloroplast photosystem genes in seed plants.
Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus thir...
Molecular phylogenetics of the allodapine bee genus Braunsapis: A-T bias and heterogeneous substitution parameters.
Extreme AT bias in Hymenopteran mitochondrial genes has created difficulties for molecular phylogenetic analyses, especially for older divergences where multiple substitutions can erode signal. Heterogeneity in the evolutionary rates of different codon positions and different genes also appears to have been a major problem in resolving ancient divergences in allodapine bees. Here we examine th...
Improved Bayesian Phylogenetic Inference in a Statistical Alignment Framework Advanced Software Design for StatAlign
Long-term trends in computational phylogenetics show a steady transition of focus from traditional tree reconstruction methods towards Bayesian approaches. The early distance-based techniques such as UPGMA and Neighbour Joining are today considered less accurate, primarily due to the loss of information when condensing sequence data into a distance matrix. Maximum parsimony is fast but suffers f...
Twisted trees and inconsistency of tree estimation when gaps are treated as missing data - The impact of model mis-specification in distance corrections.
Statistically consistent estimation of phylogenetic trees or gene trees is possible if pairwise sequence dissimilarities can be converted to a set of distances that are proportional to the true evolutionary distances. Susko et al. (2004) reported some strikingly broad results about the forms of inconsistency in tree estimation that can arise if corrected distances are not proportional to the tr...
Journal title:
Volume 4, Issue
Pages -
Publication date: 2009